CCM comprising a second set of numerical values, each representing the number of bits required to encode a residual [...] of the at least one object [...];

determining, for each motion CCM, the average number of bits spent on encoding motion vectors per macroblock and, for each residual CCM, the average number of bits spent on encoding [...]. Thereafter, the steps comprise comparing, for each motion CCM, the [...] number of bits spent [...] per macroblock for the [...] CCM [...] reduced [...] the corresponding average to produce a corresponding thresholded motion CCM. [...] each thresholded motion CCM [...] corresponding to [...] of the thresholded motion CCMs. Thereafter, classifying each of the run-lengths of zeroes into [...] medium run-length and long run-length, and determining, for each [...]. In this way, for [...] a corresponding set [...] comprises a descriptor for features.

Description of the Drawing:

Figure 1 is a table listing a set of MPEG-4 objects known to practitioners in the HDTV art, along with descriptors derived according to principles of the present invention, which illustrate the use of the present invention.

Feature extraction from motion and residual bit allocation profiles

[0020] MPEG-2 and [...] video "inter" compression essentially consists of block-matching-based motion compensation followed by DCT encoding of the residual. The residual is a convenient measure of the difference between [...] and the previous frame. Motion vectors provide an indication of the motion characteristics [...]. [...] residual data together indicate the spatio-temporal compression complexity of a video [...]. Since larger motion vectors take more bits to encode (as do larger residuals, for the same [...]), the number of bits expended on motion vectors and residuals directly indicates [...] of a video sequence. The bit expenditure is readily determined from the compressed [...] decoding (Variable Length Code parsing) and no inverse DCT. The bit expenditures, [...] step-size, are readily computed measures of spatio-temporal compression complexity. [...] of the compression complexity depends on the spatio-temporal characteristics [...]. [...] attribute of spatio-temporal complexity can be used as a matching [...] sequences.

[0021] According to a further aspect of the present invention, a bit allocation based descriptor is constructed for [...]. For each object/frame, two "compression complexity matrices" are constructed [...] required for encoding the motion vectors and the number of [...] macroblock in the object/frame.

[0022] Thus $C_{mv} = \{R_{mv}(i,j)\}$ and $C_{res} = \{R_{res}(i,j)\}$ are the rate matrices corresponding to the motion vectors and [...]. The Quantization Parameter QP for each of the blocks is also stored in a matrix Q. For simplicity, [...] are considered; the bit allocation based descriptor for a frame [...] the following steps.

1. If a macroblock of the P-frame is encoded as an Intra block, then its motion vector bit expenditure is set to zero and its residual bit expenditure is set to the bits spent on the intra coding. This is done because intra-coding can be interpreted as [...] block as a result of the motion compensation, followed by coding of the difference (residual) between [...] block and the block being encoded.

2. Although the motion vector bit expenditure is not directly affected by quantizer step size, the quantizer step size affects the residual bit expenditure directly. Therefore, the quantization step size value is included as part of the descriptor. This value may be different for each macroblock, in which case a quantizer value for each block/row would be included as part of the descriptor, for example, in the form of a matrix.

3. The average number of bits spent on motion vectors per macroblock, $C_{mv}^{avg}$, of the frame/object can be calculated from $C_{mv}$. That is, where M and N are measured in numbers of 16x16 macroblocks (e.g. for a QCIF 176x144 [...]
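The construction steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `Macroblock` record and its field names are hypothetical stand-ins for whatever a Variable Length Code parser would report, and the average is taken as the plain mean over all macroblocks of the frame or object.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Macroblock:
    """Hypothetical per-macroblock record from a VLC parser (assumed layout)."""
    row: int
    col: int
    intra: bool
    mv_bits: int       # bits spent on the motion vector
    texture_bits: int  # bits spent on the residual, or on intra coding
    qp: int            # quantizer parameter used for this block


def build_ccms(blocks: List[Macroblock], rows: int, cols: int) -> Tuple[list, list, list]:
    """Build the motion CCM, the residual CCM and the quantizer matrix Q."""
    c_mv = [[0] * cols for _ in range(rows)]
    c_res = [[0] * cols for _ in range(rows)]
    q = [[0] * cols for _ in range(rows)]
    for mb in blocks:
        # Step 1: an intra block contributes zero motion bits, and all of its
        # coded bits count as residual expenditure.
        c_mv[mb.row][mb.col] = 0 if mb.intra else mb.mv_bits
        c_res[mb.row][mb.col] = mb.texture_bits
        # Step 2: keep the quantizer value alongside the bit counts.
        q[mb.row][mb.col] = mb.qp
    return c_mv, c_res, q


def average_bits(matrix: list) -> float:
    """Step 3: mean bit expenditure per macroblock."""
    total = sum(sum(row) for row in matrix)
    count = sum(len(row) for row in matrix)
    return total / count


if __name__ == "__main__":
    demo = [Macroblock(0, 0, False, 6, 10, 4), Macroblock(0, 1, True, 9, 40, 4)]
    c_mv, c_res, q = build_ccms(demo, rows=1, cols=2)
    print(c_mv, c_res, average_bits(c_mv))
```

Keeping Q as a separate matrix mirrors the text's point that the quantizer value only qualifies the residual figures, not the motion figures.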



Akiyo: (10 Frames per Second) Object Number 0

Size 11x9 in macroblocks (Macroblock size 16x16) (Background)

0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 2 0 0 0 7 6 0 0
0 0 4 0 0 0 0 0 0 8 0
0 2 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0




Object Number 1 size 9x8 (Akiyo's head and shoulders)

0 0 0 10 21 15 0 0 0
0 0 0 4 2 2 0 0 0
0 0 0 22 16 18 4 0 0
0 0 0 14 2 4 2 0 0
0 0 6 4 22 22 2 5 0
0 4 6 2 2 29 6 0 0
0 2 0 2 2 2 6 0 4
0 0 0 0 2 2 2 2 0

[0030] The average number of bits per macroblock is 3.77, a significantly larger number than that associated with "background." The matrix after thresholding in this case would appear as follows:

0 0 0 10 21 15 0 0 0
0 0 0 4 0 0 0 0 0
0 0 0 22 16 18 4 0 0
0 0 0 14 0 4 0 0 0
0 0 6 4 22 22 0 5 0
0 4 6 0 0 29 6 0 0
0 0 0 0 0 0 6 0 4
0 0 0 0 0 0 0 0 0



[...] Similarly, the run-length representation contains much more data and would appear as follows.

3 10 0 21 0 15 6 4 0 2 0 2 6 22 0 16 0 18 0 4 5 14
0 2 0 4 0 2 4 6 0 4 0 22 0 22 0 2 0 5 2 4 0 6 0 2
0 2 0 29 0 6 1 4 4 2 0 2 0 2 0 2 1 (Threshold T = 0)
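The listing above appears to alternate a count of preceding zeroes with the next non-zero value, scanning the matrix row by row, and to end with the count of trailing zeroes. That reading is an inference from the numbers, not something this section states. Under that assumption, a minimal Python sketch of the encoding, applied to the first two rows of the Object Number 1 motion matrix, is:

```python
def run_value_pairs(matrix):
    """Encode a matrix in raster order as (zero_run, value) pairs.

    Returns the list of pairs and the number of zeroes left over after the
    last non-zero value.
    """
    pairs = []
    run = 0
    for row in matrix:
        for value in row:
            if value == 0:
                run += 1
            else:
                pairs.append((run, value))
                run = 0
    return pairs, run


if __name__ == "__main__":
    first_two_rows = [
        [0, 0, 0, 10, 21, 15, 0, 0, 0],
        [0, 0, 0, 4, 2, 2, 0, 0, 0],
    ]
    pairs, trailing = run_value_pairs(first_two_rows)
    print(pairs)     # [(3, 10), (0, 21), (0, 15), (6, 4), (0, 2), (0, 2)]
    print(trailing)  # 3
```

Runs carry over from the end of one row to the start of the next, which is why the fourth pair has a run of 6.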

The corresponding residual bit expenditure for Object No. 0 and Object No. 1 would be as follows.

Residual Bit Expenditure, Object Number 0

0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 13 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 8 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 16 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0

Object Number 1

0 0 0 5 0 28 0 0 0
0 0 0 18 24 22 0 0 0
0 0 0 69 55 25 14 0 0
0 0 0 56 96 61 21 0 0
0 0 0 28 9 26 31 17 0
0 23 0 7 36 8 19 0 0
0 16 0 5 63 5 14 0 0
0 0 0 0 18 47 51 25 0



[...] The average bits per macroblock associated with residuals for Object No. 0 is 0.37 and for Object No. 1 is 13.08. The respective run-length representations would be as follows.

Object No. 0

40 13 20 8 20 16 20 (Threshold T = 0.37)

Object No. 1

5 28 6 18 0 24 0 22 6 69 0 55 0 25 0 14 5 56 0 96 0 61 0 21 5 28 1 26
0 31 0 17 2 23 2 36 1 19 3 16 2 63 1 14 6 18 0 47 0 51 0 25 1
(Threshold T = 13)

[0034] The image "Monitor" from the News sequence has also been analyzed.

Monitor from News Sequence (20 Frames per Second) Size 6x5

Motion Complexity Matrix $C_{mv}$

0 4 0 4 15 16 

2 26 7 33 6 20 

0 4 32 0 26 16 

0 0 2 2 26 21 

0 0 0 2 2 0 

Average Bits/Macroblock = 8.86

Monitor

Matrix after thresholding

0 0 0 0 15 16 

0 26 0 33 0 20 

0 0 32 0 26 16 

0 0 0 0 26 21 

0 0 0 0 0 0 

$N_0 = 20$; $N_{sr} = 7$; $N_{mr} = 1$; $N_{lr} = 0$
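The Monitor example can be reproduced with a short Python sketch: compute the average of the motion complexity matrix, zero every entry below that average, then collect the runs of zeroes in raster order. The `short_max` and `medium_max` cutoffs used to sort runs into short, medium and long are placeholders chosen for illustration only, since the category boundaries are not given in this section.

```python
def threshold_ccm(matrix):
    """Return (average, matrix with every entry below the average set to 0)."""
    total = sum(sum(row) for row in matrix)
    count = sum(len(row) for row in matrix)
    avg = total / count
    return avg, [[v if v >= avg else 0 for v in row] for row in matrix]


def zero_runs(matrix):
    """Lengths of the non-empty runs of zeroes, scanning row by row."""
    runs, run = [], 0
    for row in matrix:
        for v in row:
            if v == 0:
                run += 1
            elif run:
                runs.append(run)
                run = 0
    if run:
        runs.append(run)
    return runs


def classify_runs(runs, short_max, medium_max):
    """Count runs per category; the two cutoffs are assumed, not from the text."""
    counts = {"short": 0, "medium": 0, "long": 0}
    for r in runs:
        if r <= short_max:
            counts["short"] += 1
        elif r <= medium_max:
            counts["medium"] += 1
        else:
            counts["long"] += 1
    return counts


MONITOR_MV = [
    [0, 4, 0, 4, 15, 16],
    [2, 26, 7, 33, 6, 20],
    [0, 4, 32, 0, 26, 16],
    [0, 0, 2, 2, 26, 21],
    [0, 0, 0, 2, 2, 0],
]

if __name__ == "__main__":
    avg, thresholded = threshold_ccm(MONITOR_MV)
    runs = zero_runs(thresholded)
    print(avg)        # 266 / 30, about 8.87
    print(runs)       # [4, 1, 1, 1, 2, 1, 4, 6]
    print(sum(runs))  # 20 zero entries in total
    print(classify_runs(runs, short_max=5, medium_max=10))
```

The thresholded matrix this produces matches the one printed above for Monitor, and the 20 zero entries fall into eight runs.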



Residual Complexity Matrix

0 19 0 49 169 33
7 82 33 49 248 32
0 24 26 0 76 0
0 0 48 36 64 9
0 0 0 14 20 0

Average Bits/Macroblock = 94.36    QP = 12



[...] shown above are descriptors for MPEG-4 objects. It can be seen that the low motion [...] of the sequence "Akiyo" are very easy to distinguish from high [...] such as [...] sequence.

[0035] Table 1 [...] shows examples of spatio-temporal complexity for MPEG-2 frames. The properties of an [...] usually not as homogeneous [...] of MPEG-4 objects, thus a variation [...]. However, [...] descriptor according to the present invention enables simple and effective [...] spatio-temporally similar frames.



TABLE 1

SEQUENCE                     BITS PER MACROBLOCK    COMPLEXITY MEASURE
                             SPENT ON MOTION        PER MACROBLOCK
Football (720x480)           14                     3332
Cheerleaders (720x480)       11.3                   3B82
[...] (1920x1080)            30                     457?
Marching Band (1920x1080)    13.8                   4317









[...] resolution data gives [...] motion vectors. It is therefore important to retain information as to frame size while making use of the current matching criterion. Note also that the descriptor is applicable to B-frames, as well as to I-frames [...] the motion vector part.

[0037] The descriptors according to the present invention have been applied mainly to MPEG-[...] it has readily available [...] into objects. Since objects have more or less homogeneous properties, comparisons making use of the present descriptors [...]. [...] object MPEG-4 sequences have been compared by using an object to object comparison [...]. Since the basic motion compensation information has been used for the descriptor, the results should be readily applicable to any compressed video syntax that uses block motion compensation coupled with DCT encoding of the residual. Furthermore, since un-segmented frames can also be considered to be [...] objects, this approach should also apply to such frames. In that case, descriptors are developed for the sequences by treating the sequences as single objects. Since the characteristics of such "composite" objects are not homogeneous, any comparison with descriptors of individual objects is unlikely to yield valid results. However, comparisons between two sets of frame based descriptors will provide useful results.

[0038] This work has also concentrated on MPEG-1 bit rates since a target application would be multi-media databases in which the minimum expected quality is high. The work has also been principally related to full frame rates [...] 25 or 30 frames per second. It should be noted that the motion complexity features would change as a function of frame [...]. [...] has been determined experimentally that the allocation of bits spent on the motion vectors does not [...] bit rates. Only the residual bit allocation is significantly affected by [...]. [...] also been found that changes in rate control strategy do not significantly affect the motion pro[...] information. These factors have led to a descriptor that emphasizes features [...] largely on motion properties. However, residual information is developed and retained because it provides different information that can be useful in certain circumstances. For example, if two objects have the same motion properties but, with use of the same QP (Quantization Parameter), one object requires more bits for residual encoding than the other, then the former is more spatio-temporally complex than the latter. However, if the quantization step size for one object is greater than that for another object but the residual encoding bits spent for the second object are [...], no conclusion can be drawn regarding the relative spatio-temporal complexity of the two objects. That is, the residual bit allocation does not always provide conclusive information in this regard.
[0039] T[...] values are shown for a commonly known set of MPEG-4 objects in Figure 1 of the Drawing. [...] descriptors can be illustrated by referring to this Figure 1. The sequence [...] descriptors may take several different forms and can be illustrated by two particular search procedures.

[0040] The first procedure may be identified as a "cascaded" search in which one descriptor feature at a time for [...] objects are compared to successively narrow the data set. For instance, assume that in a first stage of the search, in which a first feature is employed, a list of 10 objects that "match" the target [...] a list of [...] 50 [...]; in a second search stage (a different descriptor feature) 6 "matches" are obtained from the 10 items [...] in a cascaded search.
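A cascaded search of this kind can be sketched in Python as below. Everything concrete here is hypothetical: the descriptor records, the field names `avg_mv` and `n_short`, the distance functions and the number of survivors kept at each stage are illustrative choices, not values taken from the text.

```python
def cascaded_search(target, candidates, stages):
    """Narrow the candidate list one descriptor feature at a time.

    stages is a list of (distance_function, keep_count) pairs applied in
    order; each stage keeps the keep_count candidates closest to the target
    under that stage's feature.
    """
    survivors = list(candidates)
    for distance, keep in stages:
        # sorted() is stable, so ties keep their earlier relative order.
        survivors = sorted(survivors, key=lambda c: distance(target, c))[:keep]
    return survivors


def motion_distance(a, b):
    """First-stage feature: average motion-vector bits per macroblock."""
    return abs(a["avg_mv"] - b["avg_mv"])


def run_distance(a, b):
    """Second-stage feature: count of short zero runs (assumed field)."""
    return abs(a["n_short"] - b["n_short"])


if __name__ == "__main__":
    target = {"name": "T", "avg_mv": 3.8, "n_short": 5}
    database = [
        {"name": "A", "avg_mv": 2.0, "n_short": 5},
        {"name": "B", "avg_mv": 2.0, "n_short": 1},
        {"name": "C", "avg_mv": 1.2, "n_short": 4},
        {"name": "D", "avg_mv": 0.2, "n_short": 0},
        {"name": "E", "avg_mv": 0.9, "n_short": 5},
    ]
    stages = [(motion_distance, 3), (run_distance, 2)]
    print([c["name"] for c in cascaded_search(target, database, stages)])
    # ['A', 'C']
```

Each stage only ever sees the survivors of the previous one, so the more expensive or more discriminating feature can be placed last.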
[0041] A second search procedure comprises using weighted combinations of the descriptor features for making [...]. [...] weights to be given to different features involves complex con[...]. The cascaded approach provides a more straightforward approach and is therefore preferred.

[0042] One set of features [...] cascaded search has been found to be the sequence of $C_{mv}^{avg}$ followed by the run-length feature set comprising $N_{sr}$, $N_{mr}$ and $N_{lr}$. The results of matching tests show that the [...] characteristics, while the second stage eliminates candidates that have the same [...] motion complexity but a different [...] of motion intensity. The matches sometimes are semantically quite different from [...]
TABLE 2
First Stage Matching

Target Object and Objects in       Average Motion Vector Bit
Descending Order of Similarity     Expenditure per Macroblock
Akiyo-Head and Shoulders           3.77
Container Ship-Flag                2.0
Coastguard-Motor Launch            2.0
Container Ship-Ship                1.16
News-News Readers                  1.2
Akiyo-Full                         0.87



TABLE 3
Second Stage Matching

Target Object and Objects in       Most Common Run     Frequency of Most Common
Descending Order of Similarity     Length Category     Run Length Category
Akiyo Head and Shoulders           Short               5
Akiyo Full                         Short               5
Container Ship-Ship                Short               5
News-News Readers                  Short               4
Flag                               N/A



TABLE 4

Target Object and Objects in Descending     Average Motion Bit
Order of Similarity to Target Object        Expenditure per Macroblock
Akiyo Still Background                      0.22
News Still Background                       0.2
Container Ship-Foreground (Flagpole)        0.12
Container Ship-Still Background (Sky)       0.4
News Text Overlay                           0.0
Container Ship-Small Boat                   0.8

[0044] Thus, the present descriptor method would be useful as an intermediate MPEG-7 descriptor that could be applied by a relatively simple apparatus and would facilitate computation of higher level MPEG-7 descriptors on a smaller data set by a subsequent apparatus or program.

[0045] As a further step in employing the foregoing techniques, temporal segmentation markers may be generated and are associated with the input video sequence to locate and identify particular types of s[...] video sequence may be extracted according to the present invention.

[0046] [...] for each of abrupt scene change and fade-in/fade-out scene change detection steps are described in detail in an application entitled "METHODS OF SCENE CHANGE DETECTION AND FADE DETECTION [...] VIDEO SEQUENCES", filed prev[...]. It should be appreciated that [...] of the steps for detecting scene changes without departing from the more general aspects of the present invention.

[0047] Simply stated, a preferred method of detecting scene [...]

1. Locate the GOPs in which scene changes are suspected to exist by using a DC-image-based process on suc[...] frames.

2. Apply the bit allocation-based criterion in each of the GOPs selected in step 1 to locate cut points.

[0048] To apply this technique to MPEG-4 compressed video, the following more detailed criteria are employed.

[0049] Since MPEG-4 is object based, blocks representing similarly located objects in two adjacent frames are compared in the [...] qualifying step. The temporal change in each object is measured [...] weighted sum of [...] changes [...] the objects in a frame is determined, with the weight being related to the fraction of the total frame area occupied by the object. Object changes also are detected by repeating the procedure at the object level in each shot or scene. Changes above a threshold level indicate a suspected scene change.
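The area-weighted combination described above can be sketched in Python as follows. How the temporal change of an individual object is measured is not recoverable from this section, so the per-object change values are simply taken as inputs; the function names and the threshold argument are hypothetical.

```python
import math


def frame_change(object_changes, object_areas, frame_area):
    """Weighted sum of per-object changes.

    Each weight is the fraction of the total frame area occupied by the
    corresponding object.
    """
    return sum(
        change * (area / frame_area)
        for change, area in zip(object_changes, object_areas)
    )


def suspected_scene_change(object_changes, object_areas, frame_area, threshold):
    """Flag a suspected scene change when the weighted change exceeds threshold."""
    return frame_change(object_changes, object_areas, frame_area) > threshold


if __name__ == "__main__":
    # Two objects: a small one that changed a lot and a large, nearly static one.
    changes = [10.0, 2.0]
    areas = [25, 75]  # in macroblocks, out of a 100-macroblock frame
    total = frame_change(changes, areas, frame_area=100)
    print(total, math.isclose(total, 4.0))
    print(suspected_scene_change(changes, areas, 100, threshold=3.0))
```

Weighting by area keeps a large change in a tiny object from dominating the frame-level decision.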
[0050] [...] to encode [...] significantly [...] using a fixed threshold with MPEG-4 data results in false detection and/or fails [...]
[...] of bits required by the m-th object of the sequence [...] $R_m$ [...]. Use a sliding window to process the rate difference sequence $RP_m$ so as to capture local variation. Declare a scene change from $R_i$ to $R_{i+1}$ if

1. The difference $RP_i$ is the maximum within a symmetric sliding window of size $2k-1$, and

2. The difference $RP_i$ is also $n$ times the second largest maximum in the sliding window. $k = 3$ and $n = 2$ are used in the examples that are illustrated. The choice of these parameters depends on empirical data such as frame rate, the frequency of [...] frames, etc. Note that the rate difference can be computed only between two I frames/objects or between two P frames/objects. In other words, all the frames in the sequence $R_i$ should be either I or P frames (objects). Note also that in an MPEG-4 sequence, all the objects need not necessarily be [...] at the same time.

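The two sliding-window conditions can be sketched in Python as below. Two points are assumptions rather than statements from the text: the rate difference is taken as the absolute difference of consecutive bit counts, and the window is simply truncated at the ends of the sequence.

```python
def rate_differences(bits):
    """Assumed definition: absolute difference of consecutive bit counts."""
    return [abs(b - a) for a, b in zip(bits, bits[1:])]


def detect_scene_changes(diffs, k=3, n=2.0):
    """Return the indices i at which diffs[i] passes both window tests.

    The window covers indices i-(k-1) .. i+(k-1), i.e. 2k-1 entries, and is
    clipped at the sequence boundaries (an assumption).
    """
    hits = []
    for i, value in enumerate(diffs):
        lo = max(0, i - (k - 1))
        hi = min(len(diffs), i + k)
        window = diffs[lo:hi]
        others = window[: i - lo] + window[i - lo + 1 :]
        if not others or value <= 0:
            continue
        second = max(others)
        # Condition 1: largest in the window.
        # Condition 2: at least n times the second largest value.
        if value >= second and value >= n * second:
            hits.append(i)
    return hits


if __name__ == "__main__":
    bits_per_frame = [100, 102, 101, 103, 300, 298, 301, 299]
    diffs = rate_differences(bits_per_frame)
    print(diffs)                        # [2, 1, 2, 197, 2, 3, 2]
    print(detect_scene_changes(diffs))  # [3]
```

An index `i` in the result marks a suspected change between entries `i` and `i + 1` of the bit-count sequence; as the text notes, that sequence should contain only I frames/objects or only P frames/objects.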


[...] sequence of DC images (objects) $X = \{dc(i,j)\}$ is constructed, where $dc(i,j)$ is the DC value [...] (object). Extraction of DC values from Intra coded frames or objects is simple since it only requires entropy decoding, but extraction of DC values from Predictive or P frames/objects [...] more computation [...]. [...] difference sequence is constructed as in the previous sequence using one of several possible [...]. [...] is used between two frames X and Y as defined below:

[...]

One can therefore construct a sequence [...] for the sequence and use the previously described sliding window [...] changes.
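The metric itself is missing from this section, so the following Python sketch substitutes a sum of absolute differences between corresponding DC values purely as an assumed example of "one of several possible" choices. It shows how a difference sequence over successive DC images could be built for use with a sliding-window test.

```python
def dc_distance(x, y):
    """Assumed metric: sum of absolute differences of corresponding DC values."""
    return sum(
        abs(a - b)
        for row_x, row_y in zip(x, y)
        for a, b in zip(row_x, row_y)
    )


def dc_difference_sequence(dc_images):
    """Distance between each DC image and its successor."""
    return [dc_distance(a, b) for a, b in zip(dc_images, dc_images[1:])]


if __name__ == "__main__":
    images = [
        [[1, 2], [3, 4]],
        [[1, 2], [3, 5]],
        [[10, 10], [10, 10]],
    ]
    print(dc_difference_sequence(images))  # [1, 29]
```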

[...] Bits Taken to Encode DC Components of Residuals

[0055] It has been observed that when there is a gradual scene change, every block of the image includes a DC [...] is fading in from a completely black frame or fading out to a completely black frame. [...] that bit allocation profiles for DC components of residual blocks provide an indic[...]. A method of fade detection, which is described in greater detail in a second concurrently filed application of the present inventors, generally comprises the following steps.

1. Compare the DC images of successive I-frames to locate suspected scene changes. This has been found to be necessary [...] detecting abrupt scene changes as described above. This step helps save computation, since a search is made thereafter for a scene change only in the signal segments in which successive I-frames differ widely, thus avoiding processing the entire signal sequence.

2. For each P-frame in the regions in which there is a suspected scene change, the number of blocks with negative DC components as well as the number of blocks with positive DC components are counted. For the MPEG-2 as well as the MPEG-4 case, this step is readily accomplished from the VLC parsing, since every non-zero DC component will be allocated a number of bits and a sign bit that indicates whether the component is positive or negative. Zero DC components are indicated by the run-lengths and thus can be readily skipped.

3. Determine the characteristic of the two numbers obtained above versus the frame number, and determine the regions in which suspected scene changes have been located according to step 1 above.

4. Declare a fade-out if the number of negative transitions is consistently greater than or equal to 60% of the total number of non-zero transitions. Conversely, declare a fade-in if the number of positive transitions meets the aforementioned threshold. It should be noted that a version of a sliding window may be used in place of the stated 60% threshold.
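The sign-counting decision in the last step can be sketched in Python as follows. This is a hedged illustration: "consistently" is interpreted here as holding for every P-frame in the suspected region, the pairing of a negative majority with fade-out follows the reading of the step given above, and the function name and input layout are hypothetical.

```python
def classify_fade(sign_counts, ratio=0.6):
    """Classify a suspected region from per-P-frame DC sign counts.

    sign_counts is a list of (negative_count, positive_count) pairs, one per
    P-frame. Frames with no non-zero DC components are ignored. Returns
    "fade-out", "fade-in" or None.
    """
    usable = [(neg, pos) for neg, pos in sign_counts if neg + pos > 0]
    if not usable:
        return None
    # Assumption: the ratio must be met in every frame of the region.
    if all(neg >= ratio * (neg + pos) for neg, pos in usable):
        return "fade-out"
    if all(pos >= ratio * (neg + pos) for neg, pos in usable):
        return "fade-in"
    return None


if __name__ == "__main__":
    print(classify_fade([(70, 30), (65, 35), (80, 20)]))  # fade-out
    print(classify_fade([(30, 70), (20, 80)]))            # fade-in
    print(classify_fade([(50, 50), (70, 30)]))            # None
```

A sliding-window variant, which the text mentions as an alternative to the fixed ratio, would replace the `all(...)` tests with a check over a moving subset of frames.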
MPEG-4 Compressed Video Sequences
Format: QCIF (176x144)
Frame Rate: 30 frames/sec
Compression [...] 1.2 Mbps

Sequence        Object                    VOP Size  [...]   [...]   QP     [...]   Most Common   Frequency
                                                                                   Run Length
News            Still [...]               11x9      0.2     [...]   3      89      [...]         3
[...]           [...]                     6x5       [...]   193.5   3      [...]   Short         [...]
News            Newsreaders               11x7      [...]   42.19   3      30      Short         4
News            Text Overlay              [...]     0.0     0.0     3      [...]   [...]         [...]
[...]           [...]                     [...]     5.48    41.5    2      [...]   Long          [...]
[...]           Motor Launch              [...]     2.00    35.6    3      23      Long          1
CG              Small Motorboat and Wake  [...]     4.00    63.8    3      8       Long          3
CG              Background                [...]     2.36    51.0    [...]  30      Medium        4
Container Ship  Water                     11x8      [...]   5.46    6      58      Short         18
CS              Ship                      [...]     1.16    46.13   6      15      [...]         4
CS              Small Boat                [...]     [...]   16.4    6      [...]   [...]         [...]
CS              Foreground (Flagpole)     11x9      0.12    1.79    6      90      Long          4
CS              Still Bgd. (Sky)          15x3      0.4     2.09    6      29      [...]         2
CS              Flag                      [...]     2       97      6      0       N/A           0
Akiyo           Background                11x9      0.22    2.93    [...]  89      Long          3
Akiyo           Head and Shoulders        9x8       1.77    35.3    4      37      Short         5

Figure 1: Proposed Descriptors for Various MPEG-4 Objects